Remove preserve_zero and zero_point_domain from choose_qparams_affine #2149
Conversation
def choose_qparams_affine_tiny_gemm(
    input: torch.Tensor,
    mapping_type: MappingType,
    block_size: Tuple[int, ...],
nit: change this to Tuple[int] as well to be consistent, assuming it means the same thing
Block size is a tuple with multiple integers, hence it will need to be Tuple[int, ...]
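For context on this reply: in torchao, block_size carries one entry per input dimension (e.g. (1, 32) for per-group quantization of a 2D weight), so its length varies with tensor rank, which is why the annotation needs the ellipsis form. The helper below is hypothetical and only illustrates that shape relationship:

import torch
from typing import Tuple

def num_quantization_groups(shape: Tuple[int, ...], block_size: Tuple[int, ...]) -> int:
    # block_size has one entry per tensor dimension, which is why
    # Tuple[int, ...] rather than Tuple[int] is the right annotation.
    assert len(shape) == len(block_size), "one block-size entry per dimension"
    groups = 1
    for dim, block in zip(shape, block_size):
        assert dim % block == 0, "each dimension must divide evenly into blocks"
        groups *= dim // block
    return groups

weight = torch.randn(128, 256)
print(num_quantization_groups(tuple(weight.shape), (1, 32)))  # 128 * 8 = 1024 groups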
    target_dtype: torch.dtype,
    quant_min: Optional[Union[int, float]] = None,
    quant_max: Optional[Union[int, float]] = None,
    eps: Optional[float] = None,
    scale_dtype: Optional[torch.dtype] = None,
    zero_point_dtype: Optional[torch.dtype] = None,
I think we could probably simplify this list as well; only the configurable things are needed. This can be a separate PR.
Agreed
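A rough sketch of what the trimmed-down list might look like (hypothetical, for the follow-up PR, not an agreed-upon API):

def choose_qparams_affine_tiny_gemm(
    input: torch.Tensor,
    mapping_type: MappingType,
    block_size: Tuple[int, ...],
    target_dtype: torch.dtype,
    quant_min: Optional[int] = None,
    quant_max: Optional[int] = None,
) -> Tuple[torch.Tensor, torch.Tensor]:
    # eps, scale_dtype, and zero_point_dtype would be derived internally
    # from input.dtype and target_dtype instead of being parameters.
    ...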
    target_dtype: torch.dtype,
    quant_min: Optional[Union[int, float, bool]] = None,
    quant_max: Optional[Union[int, float, bool]] = None,
    eps: Optional[float] = None,
    scale_dtype: Optional[torch.dtype] = None,
    zero_point_dtype: Optional[torch.dtype] = None,
same here
    MappingType.SYMMETRIC.name,
    MappingType.SYMMETRIC_NO_CLIPPING_ERR.name,
    MappingType.ASYMMETRIC.name,
    MappingType.SYMMETRIC,
if this op has to be lowered, we'd need to use str instead of enum
For all the new ops I've used the MappingType enum. Should I update them to str?
I think just making sure the default one (ZeroPointDomain=INT and preserve_zero=True) can be lowered is fine for now
choose_qparams_affine_with_min_max might be lowered as well I think, but you can add a TODO and worry about this a bit later as well
Okay
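For readers following the str-vs-enum point: torch.library custom-op schemas only accept types like Tensor, int, float, bool, and str, so a Python enum can't cross the op boundary directly. A minimal sketch of the usual workaround, with a hypothetical op name (not this PR's actual registration):

import torch
from enum import Enum

class MappingType(Enum):
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"

# Hypothetical registration: the schema takes a str, since custom-op
# schemas cannot express Python enums.
@torch.library.custom_op("my_ns::choose_qparams_demo", mutates_args=())
def choose_qparams_demo(input: torch.Tensor, mapping_type: str) -> torch.Tensor:
    mt = MappingType[mapping_type]  # convert back to the enum inside the op
    if mt == MappingType.SYMMETRIC:
        return input.abs().amax().unsqueeze(0)
    return (input.amax() - input.amin()).unsqueeze(0)

# Callers pass the enum's name across the op boundary:
scale = choose_qparams_demo(torch.randn(8), MappingType.SYMMETRIC.name)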
@@ -301,12 +305,6 @@ def quantize_affine(
        output_dtype (torch.dtype): requested dtype (e.g. torch.uint8) for output Tensor
        quant_min (Optional[int]): minimum quantized value for output Tensor, if not specified, it will be derived from dtype
        quant_max (Optional[int]): maximum quantized value for output Tensor, if not specified, it will be derived from dtype
        zero_point_domain (ZeroPointDomain): the domain that zero_point is in, should be either integer or float
we should probably preserve these for now, and move them to quant_api; same for the doc for the preserve_zero arg
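For readers following the removal: the two domains the deleted docstring referred to differ in how dequantization applies the zero point. A simplified sketch of the distinction as implemented in torchao's quant primitives (illustrative only, not this PR's code):

# ZeroPointDomain.INT: zero_point is an integer in the quantized domain.
def dequant_int_domain(q, scale, zp):
    return (q - zp) * scale

# ZeroPointDomain.FLOAT (tinygemm-style): zero_point is a float in the
# original domain, applied after rescaling around the quant-range midpoint.
def dequant_float_domain(q, scale, zp, quant_min, quant_max):
    mid_point = (quant_max + quant_min + 1) / 2
    return (q - mid_point) * scale + zp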
No description provided.